PROBLEMS OF IDENTIFICATION



D. P. GAVER
P. A. JACOBS 
I. G. O'MUIRCHEARTAIGH 
A. MELDRUM 

1. Formulations; Background and Introductory Comments

Envision this abstract situation. There are J items, distinct from one 
another and bearing names. Think of them as BIRDS of different types. Each 
one is characterized by a vector of identifying components (you can possibly 
think of physical characteristics such as color, flight speed, 
characteristics of song, etc. as these parameters). In addition, it may (or 
may not) be known where the items are located geographically; they 
occasionally move, and may move together in suitable flocks. 

Next, there is a group of individuals, called WATCHERS, who are
sensitive to the parameters (physical characteristics) just mentioned, when
the latter become evident. In effect, some bird may sing his song, and one
or more of the WATCHERS will hear and record various features of the
song, but with error; the same with other features or parameters.
Observations are made "in the dark": observing WATCHERS cannot see the BIRDS
before the song and other qualities become evident. In fact, the objective
of the group of individuals is to collectively identify the BIRDS in
question as well as possible just by comparing notes on the parameter
announcements, e.g. song and other feature characteristics.

Errors of various types are easily made, depending upon the operating
characteristics of the WATCHERS and on the distribution of the parameters.
Perhaps a group of BIRDS will all sing, and two WATCHERS will confuse two or
more BIRDS whose songs they hear: both will state their estimates of "the"
song length of what they take to be a single BIRD, whereas in fact they
have focused on two different BIRDS with similar-enough songs.

If the individual assessments are error-prone (as they will be) and if
the distribution of the vector parameters is unfortunate, being tightly
concentrated around a point in p-space (parameter space), the advantage of
the WATCHERS is minimal: they will be unable to accurately discern a
particular BIRD's presence, much less how many BIRDS are singing. If
several WATCHERS are responding to two different BIRDS, their composite
single assessment of the parameter may fail to conform to anything real.

The general problem is to identify singing BIRDS using observations that are
error-prone and even gross-error (outlier) prone.

With this as background we begin to formulate a variety of simple 
problems and to consider their implications. 

2. The Single Item - One Parameter Case

To get started, focus on the information available from an announcement
(song) by a single BIRD. Call the parameter value $\theta$, and focus first on
estimating $\theta$ from observations made by $I$ WATCHERS on announcement of
$\theta$. Now it may actually be known that if

$$\theta = \mu_j, \qquad j = 1, 2, \ldots, J \tag{2.1}$$

the BIRD is the $j$th of a group, and is named George; if the estimate of
$\theta$, $\hat\theta$, actually is very near $\mu_j$, then we announce
confidently that we have heard an announcement by George the robin ("by
George, I think I heard him"). Things might not be quite so simple: the
actual parameter announced may be distributed somewhere near the value
$\mu_j$. Increasing the spread or variability of announced values of
$\theta$ around $\mu_j$ will be confusing to each WATCHER, and the
announcement that we indeed heard George himself becomes less likely to be
true.

2.1 A First Step: Likelihood

Suppose each individual, reacting to an announcement (song), estimates or
quotes a $\theta$-value ($x_i$ for the $i$th), and further it is known that

$$P\{X_i \in (dx_i) \mid \theta\} = f_i(x_i;\theta)\,dx_i. \tag{2.2}$$

An important special case is that errors are normally distributed (or
Gaussian):

$$f_i(x_i;\theta) = \exp\left[-\frac{1}{2}\left(\frac{x_i-\theta}{\sigma_i}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_i}, \tag{2.3}$$

so WATCHER $i$ ($i = 1, 2, \ldots, I$) estimates the value of the parameter
$\theta$ with errors that are $N(0, \sigma_i^2)$. For the moment, assume that
there is just the one BIRD present. If all $I$ WATCHERS independently
estimate $\theta$ and do so with independently distributed errors, then it
makes sense to write down and examine the likelihood function

$$L(\theta;x) = \prod_{i=1}^{I} f_i(x_i;\theta) \tag{2.4}$$
or, taking logs and concentrating on the Normal form,

$$\ell(\theta;x) = \sum_{i=1}^{I} \ln f_i(x_i;\theta) = -\frac{1}{2}\sum_{i=1}^{I}\left(\frac{x_i-\theta}{\sigma_i}\right)^2, \tag{2.5}$$

omitting irrelevant constants. Now with $\sigma_i^2$ known the unrestricted
maximization of the likelihood produces the time-honored formula

$$\hat\theta = \frac{\sum_{i=1}^{I} x_i/\sigma_i^2}{\sum_{i=1}^{I} 1/\sigma_i^2}, \tag{2.6}$$

i.e. the inverse-variance-weighted mean of the individual observations. The
variance of the estimate is

$$\mathrm{Var}[\hat\theta] = \frac{1}{\sum_{i=1}^{I} 1/\sigma_i^2}. \tag{2.7}$$

If all the above assumptions hold true, then one presumably compares
$\hat\theta$ to the "known" parameters of various BIRDS and picks BIRD
$j^{\#}$, where $|\hat\theta - \mu_{j^{\#}}| < |\hat\theta - \mu_j|$ for all
$j \neq j^{\#}$; call this the nearest neighbor strategy, NN.

Because of symmetry, the solution $\hat\theta$ is also the mean of a Bayes
posterior with non-informative (flat, improper) prior. It is also the best
linear unbiased (BLUE) estimator of $\theta$, so it should be at least mildly
satisfactory to most non-rabid statisticians of any faith or persuasion. In
the important (oversimplified) case in which each individual bird sings
precisely on key, so the unknown $\theta = \mu_j$ for some $j$, the procedure
yields the maximum likelihood estimator of $\mu$ from the restricted
parameter space $\{\mu_1, \mu_2, \ldots, \mu_J\}$. See Hammersley (1950) for
an early discussion of this problem.
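The estimate (2.6), its variance (2.7) and the NN choice can be sketched in a
few lines of Python. This is only an illustration: the function name
weighted_mean_nn and the sample observations are invented here, and the BIRD
parameters 1, ..., 5 are borrowed from the simulation section.

```python
def weighted_mean_nn(x, sigma, mu):
    """Return (theta_hat, var_hat, j_star) for normal errors.

    theta_hat is the inverse-variance-weighted mean (2.6), var_hat its
    variance (2.7), and j_star the 0-based index of the nearest mu_j (NN).
    """
    weights = [1.0 / s ** 2 for s in sigma]          # 1 / sigma_i^2
    total = sum(weights)
    theta_hat = sum(w * xi for w, xi in zip(weights, x)) / total
    var_hat = 1.0 / total
    j_star = min(range(len(mu)), key=lambda j: abs(theta_hat - mu[j]))
    return theta_hat, var_hat, j_star


# Hypothetical data: three WATCHERS, the third less precise than the others.
x = [2.8, 3.3, 3.1]
sigma = [0.5, 0.5, 1.0]
mu = [1, 2, 3, 4, 5]
theta_hat, var_hat, j_star = weighted_mean_nn(x, sigma, mu)
print(theta_hat, var_hat, mu[j_star])
```

Here the less precise WATCHER receives one quarter of the weight of each of
the other two, and the NN rule names the BIRD with parameter 3.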

2.2 A More Robust Likelihood

While many (perhaps transformed) measurement errors of physical
quantities are approximately Normal, especially "in the middle" of their
distribution, there can well be occasional outliers, in this case possibly
caused by individual mis-performance. In order to model this empirically
observed feature, it is becoming conventional to extend the tails of the
Normal in (2.3) in one of these ways:

(a) continuous scale mixing, where $\sigma_i^2$ is taken to be a
conveniently distributed (e.g. inverse Gamma) random variable.

(b) $\epsilon$-contamination, wherein

$$f_i(x_i;\theta) = (1-\epsilon_i)\exp\left[-\frac{1}{2}\left(\frac{x_i-\theta}{\sigma_{i1}}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_{i1}} + \epsilon_i\exp\left[-\frac{1}{2}\left(\frac{x_i-\theta}{\sigma_{i2}}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_{i2}}, \tag{2.8}$$

in which usually $1 - \epsilon_i$ is close to one and $\sigma_{i1}$ is
relatively small; $\epsilon_i > 0$ is close to zero, but $\sigma_{i2}$ is
large, e.g. a sizable multiple of $\sigma_{i1}$. This model was utilized in
a classical robustness context by Tukey, and also by Berger and Berliner
(1983) for Bayesian robustness purposes.

Begin by discussing (a). This approach can (but need not) lead to
replacement of the normal observation density by a Student t form:

$$f_i(x_i;\theta) = \frac{C(d_i)}{\left[1 + \left(\frac{x_i-\theta}{\sigma_i}\right)^2\frac{1}{d_i}\right]^{(d_i+1)/2}}\,\frac{1}{\sigma_i}. \tag{2.9}$$

Here view $d_i$ as a shape tuning parameter;
$\mathrm{Var}[X_i] = \sigma_i^2 d_i/(d_i-2)$ if $d_i > 2$, but kurtosis
(fourth central moment scaled to be dimension-free) can induce very
extended tails, simulating outlier occurrence. If $d_i = 1$, we get the
centered and scaled Cauchy, with notoriously long, symmetric tails. The
Cauchy tails are so long that neither mean nor variance, nor any other
moment, exists. The likelihood obtained by combining individual measures
is now

$$L(\theta;x) = \prod_{i=1}^{I}\frac{C(d_i)}{\sigma_i}\left[1 + \left(\frac{x_i-\theta}{\sigma_i}\right)^2\frac{1}{d_i}\right]^{-(d_i+1)/2}, \tag{2.10}$$

and so, up to irrelevant constants,

$$\ln L(\theta;x) = \ell(\theta;x) = -\sum_{i=1}^{I}\frac{d_i+1}{2}\ln\left[1 + \left(\frac{x_i-\theta}{\sigma_i}\right)^2\frac{1}{d_i}\right].$$

Now differentiation with respect to $\theta$ gives

$$\frac{\partial \ell}{\partial \theta} = \sum_{i=1}^{I}\frac{d_i+1}{2}\cdot\frac{2(x_i-\theta)/(\sigma_i^2 d_i)}{1 + \left(\frac{x_i-\theta}{\sigma_i}\right)^2\frac{1}{d_i}} = 0 \tag{2.11}$$

as a condition for a maximizing $\theta$, denoted by $\hat\theta$. In
principle this equation could have more than one solution; Copas has
discussed this situation.

To obtain a usually sensible (optimal) solution, proceed as follows.

Iterative Reweighting

Rewrite (2.11) as follows:

$$\sum_{i=1}^{I}\frac{x_i - \theta(r+1)}{\sigma_i^2}\,w_i(r) = 0 \tag{2.12}$$

or

$$\theta(r+1) = \frac{\sum_{i=1}^{I}(x_i/\sigma_i^2)\,w_i(r)}{\sum_{i=1}^{I}(1/\sigma_i^2)\,w_i(r)} \tag{2.13}$$

where the weight

$$w_i(r) = \frac{d_i+1}{d_i\left[1 + \left(\frac{x_i-\theta(r)}{\sigma_i}\right)^2\frac{1}{d_i}\right]}. \tag{2.14}$$

One might start the iteration at

$$\theta(1) = \mathrm{median}(x_1, x_2, \ldots, x_I), \tag{2.15}$$

and then compute the first weight

$$w_i(1) = \frac{d_i+1}{d_i\left[1 + \left(\frac{x_i-\theta(1)}{\sigma_i}\right)^2\frac{1}{d_i}\right]} \tag{2.16}$$

and use this to find the second estimate $\theta(2)$. Even the first
iteration, as described, will be quite successful at taming down individual
widely discrepant values, or "outliers". The smaller is $d_i$ ($d_i \geq 1$,
presumably) the more effectively discrepant values are reduced in influence.

After obtaining convergence, one may apply the NN approach to identify
the name (number) of the BIRD actually singing.
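The reweighting scheme (2.12)-(2.16) might be coded along the following
lines. This is a sketch under stated assumptions, not the program used for
the report: the function names, the stopping tolerance, the iteration cap
and the sample data are all invented for illustration.

```python
from statistics import median


def t_weights(x, sigma, d, theta):
    """Weights (2.14): (d_i + 1) / (d_i + z_i^2), z_i = (x_i - theta) / sigma_i."""
    return [(di + 1.0) / (di + ((xi - theta) / si) ** 2)
            for xi, si, di in zip(x, sigma, d)]


def student_t_location(x, sigma, d, tol=1e-10, max_iter=200):
    """Iteratively reweighted estimate of theta under Student t errors."""
    theta = median(x)                                  # starting value (2.15)
    for _ in range(max_iter):
        w = t_weights(x, sigma, d, theta)
        num = sum(wi * xi / si ** 2 for wi, xi, si in zip(w, x, sigma))
        den = sum(wi / si ** 2 for wi, si in zip(w, sigma))
        new_theta = num / den                          # update (2.13)
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return theta, t_weights(x, sigma, d, theta)


# Hypothetical data: three WATCHERS agree, the fourth reports an outlier.
x = [2.9, 3.1, 3.2, 9.0]
sigma = [0.5] * 4
d = [1.0] * 4                                          # Cauchy-like tails
theta_hat, w = student_t_location(x, sigma, d)
print(theta_hat, w)
```

The final weights returned alongside the estimate are the ones to inspect:
the discrepant fourth value ends up with a weight close to zero, so the
estimate stays near the three agreeing observations rather than near their
arithmetic mean with the outlier.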

The above procedure will usually work satisfactorily, but may err
because of an unfortunate starting estimate. If each BIRD sings almost
precisely on key, so the unknown $\theta = \mu_j$ for some $j$, then a
precise maximum likelihood solution can be obtained by simple enumeration:
one simply evaluates (2.10) for $\theta \in \{\mu_1, \mu_2, \ldots, \mu_J\}$
and picks the $\mu_j$ that gives the maximum. For small $J$ this is
computationally feasible and provides the truly maximum likelihood solution
given the problem formulation. On the other hand, the weights produced by
the iterative solution provide a convenient index as to the relative
importance attached to the data values by the iterative procedure, so it
seems sensible to become a "weight watcher". The pattern of weights might
suggest reasons for relative comfort or discomfort with an identification:
for instance, a relatively uniform distribution of low weights perhaps
gives discomfort, while mostly high weights with a few very low ones thrown
in may give reason for comfort, presumably with a consensus of the
high-weighted values. One fact that should be noted is that the likelihood
(2.10) may well have multiple peaks or modes, and the primary one is
presumably usually found by the re-weighted iteration NN scheme suggested.
In any case, the parameter space point-by-point enumeration is often
feasible.
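Enumeration over the restricted parameter space is simple enough to show
directly. The sketch below evaluates the log of (2.10), dropping the
constants $C(d_i)/\sigma_i$ that do not depend on $\theta$, at each candidate
$\mu_j$; the function names and the data are assumptions made for the
example.

```python
import math


def t_log_likelihood(theta, x, sigma, d):
    """Log of (2.10) up to an additive constant not depending on theta."""
    return -sum(0.5 * (di + 1.0) * math.log1p(((xi - theta) / si) ** 2 / di)
                for xi, si, di in zip(x, sigma, d))


def enumerate_mle(x, sigma, d, mu):
    """Evaluate the log likelihood at every candidate mu_j; return the best index."""
    scores = [t_log_likelihood(m, x, sigma, d) for m in mu]
    j_star = max(range(len(mu)), key=lambda j: scores[j])
    return j_star, scores


# Hypothetical data with one gross outlier.
x = [2.9, 3.1, 3.2, 9.0]
sigma = [0.5] * 4
d = [1.0] * 4
mu = [1, 2, 3, 4, 5]
j_star, scores = enumerate_mle(x, sigma, d, mu)
print(mu[j_star], scores)
```

Because every candidate is scored, multiple modes of the likelihood cause no
difficulty here, and the full list of scores shows how decisively the winner
beats its neighbors.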

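Next discuss the $\epsilon$-contamination model (b). Unfortunately the
likelihood is of an awkward form:

$$L(\theta;x) = \prod_{i=1}^{I}\left[(1-\epsilon_i)\exp\left[-\frac{1}{2}\left(\frac{x_i-\theta}{\sigma_{i1}}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_{i1}} + \epsilon_i\exp\left[-\frac{1}{2}\left(\frac{x_i-\theta}{\sigma_{i2}}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_{i2}}\right]. \tag{2.17}$$

Now it can be seen that if the multiplication is carried out and some
rearrangement is done, we can express the likelihood as

$$L(\theta;x) = \sum_{k=1}^{K}\pi_k(x)\exp\left[-\frac{1}{2}\left(\frac{\theta-\mu_k(x)}{\sigma_k(x)}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_k(x)}, \tag{2.18}$$

where

$$\pi_k(x) = \cdots\,\exp\left\{-\tfrac{1}{2}R_k(x)\right\} \tag{2.19}$$

and $K = 2^I$. It turns out that $\mu_k(x)$ is a linearly weighted function
of the individual observations, and $1/\sigma_k^2(x)$ is a corresponding sum
of inverse variances; $R_k(x)$ measures the discrepancy of the individual
observations from $\mu_k(x)$.

For illustration, suppose $I = 2$, so, up to multiplicative constants,

$$L(\theta;x) = \prod_{i=1}^{2}\left[(1-\epsilon_i)\exp\left[-\frac{1}{2}\left(\frac{x_i-\theta}{\sigma_{i1}}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_{i1}} + \epsilon_i\exp\left[-\frac{1}{2}\left(\frac{x_i-\theta}{\sigma_{i2}}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_{i2}}\right] = \sum_{k=1}^{4}\pi_k(x)\exp\left[-\frac{1}{2}\left(\frac{\theta-\mu_k(x)}{\sigma_k(x)}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_k(x)},$$

where the $\{\pi_k\}$ carry the mixing products
$\{\bar\epsilon_1\bar\epsilon_2,\ \bar\epsilon_1\epsilon_2,\ \epsilon_1\bar\epsilon_2,\ \epsilon_1\epsilon_2\}$,
with $\bar\epsilon_i = 1 - \epsilon_i$; the terms $\mu_k(x)$ and
$\sigma_k(x)$ are obtained by completing the square in the exponent of each
summand.

The form of (2.18) suggests that $L(\theta;x)$ is a possibly multimodal
function, as was true of the Student t likelihood. An iterative scheme can
be set up as detailed below to estimate $\theta$, and the NN approach can
then be taken. If each BIRD sings almost precisely on key, so the unknown
$\theta = \mu_j$ for some $j$, then a precise maximum likelihood solution
can be obtained by simple enumeration as before.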
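Iterative Reweighting

Taking logarithms of (2.17) and differentiating with respect to $\theta$
results in the equation

$$\sum_{i=1}^{I}\bigl(x_i - \theta(r+1)\bigr)\left[\left(\frac{1}{\sigma_{i2}}\right)^2 + w_i(r)\left(\left(\frac{1}{\sigma_{i1}}\right)^2 - \left(\frac{1}{\sigma_{i2}}\right)^2\right)\right] = 0 \tag{2.20}$$

where the weight

$$w_i(r) = \frac{A_i(r)}{\epsilon_i\,\sigma_{i1} + A_i(r)} \tag{2.21}$$

with

$$A_i(r) = (1-\epsilon_i)\,\sigma_{i2}\exp\left(-\frac{1}{2}\bigl(x_i-\theta(r)\bigr)^2\left(\frac{1}{\sigma_{i1}^2} - \frac{1}{\sigma_{i2}^2}\right)\right). \tag{2.22}$$

As for the Student t distribution, one might start the iteration at
$\theta(1) = \mathrm{median}(x_1, x_2, \ldots, x_I)$ and then compute the
first weight; use the weights to find the second estimate; etc.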
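A possible coding of (2.20)-(2.22) follows. To keep it short the sketch
assumes that every WATCHER shares the same $\epsilon$, $\sigma_1$ and
$\sigma_2$ (the equations above allow them to differ by WATCHER); the
function names, tolerance and data are illustrative assumptions.

```python
import math
from statistics import median


def cn_weight(xi, theta, eps, s1, s2):
    """Weight (2.21): plausibility that xi came from the narrow component."""
    a = (1.0 - eps) * s2 * math.exp(
        -0.5 * (xi - theta) ** 2 * (1.0 / s1 ** 2 - 1.0 / s2 ** 2))   # (2.22)
    return a / (eps * s1 + a)


def cn_location(x, eps, s1, s2, tol=1e-10, max_iter=500):
    """Iteratively reweighted estimate of theta for contaminated normal errors."""
    theta = median(x)                          # same start as for Student t
    for _ in range(max_iter):
        # Effective precision of each observation, the bracket in (2.20).
        prec = [1.0 / s2 ** 2
                + cn_weight(xi, theta, eps, s1, s2)
                * (1.0 / s1 ** 2 - 1.0 / s2 ** 2)
                for xi in x]
        new_theta = sum(p * xi for p, xi in zip(prec, x)) / sum(prec)
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return theta


# Hypothetical data; eps, s1, s2 follow the values used in the simulations.
x = [2.9, 3.1, 3.2, 9.0]
theta_hat = cn_location(x, eps=0.1, s1=0.5, s2=5.0)
print(theta_hat)
```

An observation judged to come from the wide component is not discarded; it
simply enters with precision $1/\sigma_2^2$ instead of $1/\sigma_1^2$.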



3. Bayesian Formulations; Everything Normal

In addition to the information on $\theta$ coming from individual $i$ there
may be information on $\theta$ codable in the form of a probability density,
$p_\theta(\theta)$. The latter may actually take the form of a series of
near delta functions, one for each of the BIRDS in question. For the sake
of a bit of generality write

$$p_\theta(\theta) = \sum_{j=1}^{J} P_j \exp\left[-\frac{1}{2}\left(\frac{\theta-\mu_j}{\tau_j}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\tau_j}. \tag{3.1}$$

Here possibly $P_j = 1/J$, where $J$ represents the number of BIRDS believed
to be in the vicinity and of interest. If $\tau_j \to 0$ then the above
indeed represents a "Dirac comb" with teeth at the points $\mu_j$,
$j = 1, 2, \ldots, J$; the sharpness of the teeth is dictated by $\tau_j$:
small $\tau_j$ means that the $j$th tooth (density) is long (tall) and
sharp.

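Now by routine Bayes we get for the posterior density

$$p_{\theta|x}(\theta) = K\prod_{i=1}^{I} f_{X_i}(x_i;\theta)\,p_\theta(\theta) \tag{3.2}$$

and, if we adopt the Normal model,

$$p_{\theta|x}(\theta) = K\exp\left[-\frac{1}{2}\sum_{i=1}^{I}\left(\frac{x_i-\theta}{\sigma_i}\right)^2\right]\cdot\sum_{j=1}^{J}\frac{P_j}{\sqrt{2\pi}\,\tau_j}\exp\left[-\frac{1}{2}\left(\frac{\theta-\mu_j}{\tau_j}\right)^2\right]. \tag{3.3}$$

This can be simplified: write

$$\left(\frac{\theta-\mu_j(x)}{\tau_j(x)}\right)^2 + Q_j = \left(\frac{\theta-\mu_j}{\tau_j}\right)^2 + \sum_{i=1}^{I}\left(\frac{x_i-\theta}{\sigma_i}\right)^2 \tag{3.4}$$

where $Q_j$ doesn't depend on $\theta$, and look for $\mu_j(x)$ and
$\tau_j(x)$ in terms of the observations and parameters; to do so
differentiate with respect to $\theta$ to find

$$2\,\frac{\theta-\mu_j(x)}{\tau_j^2(x)} = 2\,\frac{\theta-\mu_j}{\tau_j^2} + 2\sum_{i=1}^{I}\frac{\theta-x_i}{\sigma_i^2}. \tag{3.5}$$

Since the coefficients of $\theta$ and 1 on each side of the equation must
match,

$$\frac{1}{\tau_j^2(x)} = \frac{1}{\tau_j^2} + \sum_{i=1}^{I}\frac{1}{\sigma_i^2} \tag{3.6}$$

and

$$\mu_j(x) = \left[\frac{\mu_j}{\tau_j^2} + \sum_{i=1}^{I}\frac{x_i}{\sigma_i^2}\right]\tau_j^2(x). \tag{3.7}$$

To identify $Q_j$ let $\theta = \mu_j(x)$ in (3.4); then

$$Q_j = \left(\frac{\mu_j(x)-\mu_j}{\tau_j}\right)^2 + \sum_{i=1}^{I}\left(\frac{x_i-\mu_j(x)}{\sigma_i}\right)^2; \tag{3.8}$$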
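that is, $Q_j$ is the scaled sum of squared deviations of (a) the $j$th
posterior mean from the $j$th prior mean and (b) the $j$th posterior mean
from each individual observation. Now return to (3.3) and substitute:

$$p_{\theta|x}(\theta) = K\sum_{j=1}^{J}\frac{P_j}{\sqrt{2\pi}\,\tau_j}\exp\left[-\frac{1}{2}Q_j\right]\exp\left[-\frac{1}{2}\left(\frac{\theta-\mu_j(x)}{\tau_j(x)}\right)^2\right]. \tag{3.9}$$

By normalization,

$$1 = \int_{-\infty}^{\infty} p_{\theta|x}(\theta)\,d\theta = K\sum_{j=1}^{J}\frac{P_j}{\sqrt{2\pi}\,\tau_j}\exp\left[-\frac{1}{2}Q_j\right]\int_{-\infty}^{\infty}\exp\left[-\frac{1}{2}\left(\frac{\theta-\mu_j(x)}{\tau_j(x)}\right)^2\right]d\theta = K\sum_{j=1}^{J}P_j\exp\left[-\frac{1}{2}Q_j\right]\frac{\tau_j(x)}{\tau_j}. \tag{3.10}$$

Thus the posterior density is of the form

$$p_{\theta|x}(\theta) = \sum_{j=1}^{J}P_j(x)\exp\left[-\frac{1}{2}\left(\frac{\theta-\mu_j(x)}{\tau_j(x)}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\tau_j(x)} \tag{3.11}$$

where

$$P_j(x) = \frac{P_j\,e^{-\frac{1}{2}Q_j}\,\tau_j(x)/\tau_j}{\sum_{j'=1}^{J}P_{j'}\,e^{-\frac{1}{2}Q_{j'}}\,\tau_{j'}(x)/\tau_{j'}}. \tag{3.12}$$

The ratio $\tau_j(x)/\tau_j$ arises because each prior component in (3.1)
carries the normalizing factor $1/(\sqrt{2\pi}\,\tau_j)$; when all the
$\tau_j$ are equal the $\tau$-factors cancel between numerator and
denominator.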
In other words, by completing the square in (3.4), the resulting form of the
posterior density (3.11)-(3.12) and the prior density (3.1) resemble each
other closely. For the important special case in which $\tau_j^2 \to 0$,
$j = 1, 2, \ldots, J$, one obtains the discrete distribution concentrated at
the values $\mu_j$ ($j = 1, 2, \ldots, J$) having probability mass function

$$P\{\theta = \mu_j \mid x\} = \frac{P_j\exp\left[-\frac{1}{2}\sum_{i=1}^{I}(x_i-\mu_j)^2/\sigma_i^2\right]}{\sum_{j'=1}^{J}P_{j'}\exp\left[-\frac{1}{2}\sum_{i=1}^{I}(x_i-\mu_{j'})^2/\sigma_i^2\right]}. \tag{3.13}$$

The component probabilities $P_j$ are very simply modified in accordance
with the observations, and can be easily updated as more observations become
available. One could hope that after a set of observations has become
available then, say,

$$P_3(x) \approx 1$$

and

$$P_j(x) \approx 0 \quad \text{for } j \neq 3,$$

which points strongly at BIRD 3 as being the one that is actually singing.
If, on the other hand, all $P_j$-values were to remain similar it might be
thought that some individuals have focused on two or more different items,
or that the noise is not well represented by the normal (Gaussian) model.
This possibility is not included in the present model, however. Note, too,
that if $\{P_j = 1/J,\ j = 1, 2, \ldots, J\}$, the discrete uniform
distribution, then naming $j = j^{\#}$ by picking the maximum probability
from (3.13) is exactly equivalent to maximizing the likelihood by direct
enumeration.
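The normal-theory update can be sketched as below: for each BIRD the code
forms $\tau_j(x)$ from (3.6), $\mu_j(x)$ from (3.7) and $Q_j$ from (3.8),
then the revised probabilities from the weights $P_j e^{-Q_j/2}
\tau_j(x)/\tau_j$. The function name, the sample data and the choice
$\tau_j = 0.1$ are assumptions made for the illustration; working on the log
scale is a numerical convenience, not part of the derivation.

```python
import math


def normal_mixture_posterior(x, sigma, mu, tau, prior):
    """Return (P_j(x), mu_j(x), tau_j(x)) lists for the normal-mixture prior."""
    data_prec = sum(1.0 / s ** 2 for s in sigma)
    data_sum = sum(xi / s ** 2 for xi, s in zip(x, sigma))
    post_mean, post_sd, log_w = [], [], []
    for m, t, p in zip(mu, tau, prior):
        var = 1.0 / (1.0 / t ** 2 + data_prec)                 # (3.6)
        mean = (m / t ** 2 + data_sum) * var                   # (3.7)
        q = ((mean - m) / t) ** 2 + sum(
            ((xi - mean) / s) ** 2 for xi, s in zip(x, sigma))  # (3.8)
        post_mean.append(mean)
        post_sd.append(math.sqrt(var))
        # log of P_j * exp(-Q_j / 2) * tau_j(x) / tau_j
        log_w.append(math.log(p) - 0.5 * q + 0.5 * math.log(var) - math.log(t))
    top = max(log_w)                       # subtract the max to avoid underflow
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w], post_mean, post_sd


# Hypothetical data: three WATCHERS, five equally likely BIRDS.
x = [2.8, 3.3, 3.1]
sigma = [0.5, 0.5, 0.5]
mu = [1, 2, 3, 4, 5]
probs, means, sds = normal_mixture_posterior(
    x, sigma, mu, tau=[0.1] * 5, prior=[0.2] * 5)
print(probs)
```

Letting the $\tau_j$ shrink toward zero in this function reproduces the
discrete formula (3.13) to within rounding, which is a convenient check on
the algebra.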
3.1 Bayesian Formulations; (1) Student t Observations

Let the prior information on BIRDS be given by (3.1). Individual
watchers independently observe $\theta$ with errors distributed according to
the Student t family:

$$f_{X_i}(x_i;\theta) = \frac{C(d_i)}{\left[1 + \left(\frac{x_i-\theta}{\sigma_i}\right)^2\frac{1}{d_i}\right]^{(d_i+1)/2}}\,\frac{1}{\sigma_i}. \tag{3.14}$$

Then the posterior density is of the form

$$p_{\theta|x}(\theta) = K\prod_{i=1}^{I} f_{X_i}(x_i;\theta)\sum_{j=1}^{J}\frac{P_j}{\sqrt{2\pi}\,\tau_j}\exp\left[-\frac{1}{2}\left(\frac{\theta-\mu_j}{\tau_j}\right)^2\right]$$

$$\phantom{p_{\theta|x}(\theta)} = K'\sum_{j=1}^{J}\frac{P_j}{\tau_j}\exp\left[-\frac{1}{2}\left(\frac{\theta-\mu_j}{\tau_j}\right)^2\right]\exp\left[-\frac{1}{2}\sum_{i=1}^{I}(d_i+1)\ln\left[1 + \left(\frac{x_i-\theta}{\sigma_i}\right)^2\frac{1}{d_i}\right]\right]. \tag{3.15}$$

In order to normalize this expression (determine $K'$), and to compute
moments (for point estimates, the NN approach, etc.) it is necessary to
integrate over all $\theta$-values; of course no simple analytic closed form
expression exists. There are two practical options:

(a) numerical integration, using Gauss-Hermite integration, e.g., by
adopting the program of Naylor and Smith (1982); or

(b) analytical approximation, using a variant of the Laplace method,
see de Bruijn (1958) or the equivalent; this classical approach has
been invoked by Mosteller and Wallace (1964), Gaver (1985),
Tierney and Kadane (1986), Lindley and Singpurwalla (1986), and by
many others as well.

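To apply Laplace, write

$$p_{\theta|x}(\theta) = K\sum_{j=1}^{J}\frac{P_j}{\tau_j}\,e^{-\frac{1}{2}S_j} \tag{3.16}$$

where

$$S_j = \left(\frac{\theta-\mu_j}{\tau_j}\right)^2 + \sum_{i=1}^{I}(d_i+1)\ln\left[1 + \left(\frac{x_i-\theta}{\sigma_i}\right)^2\frac{1}{d_i}\right]. \tag{3.17}$$

The plan is to replace $S_j$ by an approximating quadratic in $\theta$, and
thus to exhibit closed-form approximating expressions for the updated
probabilities $\tilde P_j(x)$ that are quite analogous to the "exact"
formulas (3.12) obtainable under normal/Gaussian error specifications.

To proceed, assume that the exponent is of the form

$$S_j \approx \left(\frac{\theta-\mu_j(x)}{\tau_j(x)}\right)^2 + Q_j(x) \tag{3.18}$$

where $Q_j(x)$ is at least nearly independent of $\theta$. Now differentiate
the two forms on $\theta$:

$$2\,\frac{\theta-\mu_j(x)}{\tau_j^2(x)} = 2\,\frac{\theta-\mu_j}{\tau_j^2} + \sum_{i=1}^{I}2\,\frac{d_i+1}{d_i}\cdot\frac{(\theta-x_i)/\sigma_i^2}{1 + \left(\frac{x_i-\theta}{\sigma_i}\right)^2\frac{1}{d_i}}. \tag{3.19}$$

Now identify the coefficients of $\theta$ and 1 to see that

$$\frac{1}{\tau_j^2(x)} = \frac{1}{\tau_j^2} + \sum_{i=1}^{I}\frac{1}{\sigma_i^2}\cdot\frac{(d_i+1)/d_i}{1 + \left(\frac{x_i-\theta}{\sigma_i}\right)^2\frac{1}{d_i}} = \frac{1}{\tau_j^2} + \sum_{i=1}^{I}\frac{1}{\sigma_i^2}\,w_i \tag{3.20}$$

and

$$\mu_j(x) = \left[\frac{\mu_j}{\tau_j^2} + \sum_{i=1}^{I}\frac{x_i}{\sigma_i^2}\,w_i\right]\tau_j^2(x) \tag{3.21}$$

where the weights are

$$w_i = \frac{(d_i+1)/d_i}{1 + \left(\frac{x_i-\theta}{\sigma_i}\right)^2\frac{1}{d_i}}. \tag{3.22}$$

In practice, it will be necessary to estimate $\theta$ by an iterative
re-weighting procedure, so approximate weights will be used:

$$\tilde w_i = \frac{(d_i+1)/d_i}{1 + \left(\frac{x_i-\tilde\theta}{\sigma_i}\right)^2\frac{1}{d_i}}. \tag{3.23}$$

Now replace $S_j$ in (3.16) by the quadratic approximation to find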
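$$p_{\theta|x}(\theta) \approx K\sum_{j=1}^{J}\frac{P_j}{\tau_j}\,e^{-\frac{1}{2}\tilde Q_j(x)}\exp\left[-\frac{1}{2}\left(\frac{\theta-\tilde\mu_j(x)}{\tilde\tau_j(x)}\right)^2\right] \tag{3.24}$$

where the approximate prior probability revision factor is, from (3.18),

$$\tilde Q_j(x) = \left(\frac{\mu_j-\tilde\mu_j(x)}{\tau_j}\right)^2 + \sum_{i=1}^{I}(d_i+1)\ln\left[1 + \left(\frac{x_i-\tilde\mu_j(x)}{\sigma_i}\right)^2\frac{1}{d_i}\right] \tag{3.25}$$

and so

$$\tilde P_j(x) = \frac{P_j\,e^{-\frac{1}{2}\tilde Q_j}\,\tilde\tau_j(x)/\tau_j}{\sum_{j'=1}^{J}P_{j'}\,e^{-\frac{1}{2}\tilde Q_{j'}}\,\tilde\tau_{j'}(x)/\tau_{j'}} \tag{3.26}$$

provides the approximate data-updated probability that BIRD $j$ is singing.
Here $\tilde\mu_j(x)$ and $\tilde\tau_j(x)$ denote (3.21) and (3.20)
evaluated with the approximate weights (3.23). It is reasonable to designate
$j = j^{\#}$ if $\tilde P_{j^{\#}}(x) > \tilde P_j(x)$ for all
$j \neq j^{\#}$. The form of the Bayes posterior is of course quite
analogous to that derived for the normal errors case. Here we have

$$p_{\theta|x}(\theta) \approx \sum_{j=1}^{J}\tilde P_j(x)\exp\left[-\frac{1}{2}\left(\frac{\theta-\tilde\mu_j(x)}{\tilde\tau_j(x)}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\tilde\tau_j(x)} \tag{3.27}$$

and the squared-error-minimizing point estimate, i.e., the expected value
of $\theta$ given observations $x$, is

$$\tilde\theta = E[\theta \mid x] \approx \sum_{j=1}^{J}\tilde P_j(x)\,\tilde\mu_j(x). \tag{3.28}$$

It is this value that should apparently appear in the weights,
$\tilde w_i$, in the course of the iterative calculation.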

The behavior of $\tilde Q_j(x)$, and hence of the prior to posterior
revision $P_j \to \tilde P_j(x)$, seems intuitively appealing: first, one is
led to estimate the most likely characteristic of the song of the $j$th
BIRD, $\tilde\mu_j(x)$, by combining data $x_i$, $i = 1, 2, \ldots, I$ (using
the knowledge that outliers will occur, so the estimate is made robustly)
and prior information about the variability of BIRD $j$'s song. Then this
estimate is effectively compared to (1) the candidate true mean value of
the $j$th BIRD's song, $\mu_j$, and (2) the data obtained by the listeners;
both (1) and (2) are measured on an appropriate scale of variability. If the
sum of these distances (squared and scaled) is small, the conditional
probability of $j$ being the songster is correspondingly increased;
otherwise, it is reduced.

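If $\tau_j \to 0$, $j = 1, 2, \ldots, J$, so the BIRDS always sing precisely
on key, then the above density becomes a probability mass function:

$$\tilde P_j(x) = P\{\theta = \mu_j \mid x\} = K\,P_j\prod_{i=1}^{I}\frac{C(d_i)}{\sigma_i}\left[1 + \left(\frac{x_i-\mu_j}{\sigma_i}\right)^2\frac{1}{d_i}\right]^{-(d_i+1)/2} \tag{3.29}$$

where

$$1 = K\sum_{j=1}^{J}P_j\prod_{i=1}^{I}\frac{C(d_i)}{\sigma_i}\left[1 + \left(\frac{x_i-\mu_j}{\sigma_i}\right)^2\frac{1}{d_i}\right]^{-(d_i+1)/2} \tag{3.30}$$

determines $K$. To identify the optimal $j = j^{\#}$, simply locate the
maximum $\tilde P_j(x)$.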
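The mass function (3.29)-(3.30) needs no approximation and is easy to
compute, since the factors $C(d_i)/\sigma_i$ are common to every $j$ and
cancel on normalization. The sketch below uses an invented function name
and invented data, with an unequal prior chosen only to show that the prior
enters.

```python
import math


def t_posterior_pmf(x, sigma, d, mu, prior):
    """Posterior probabilities (3.29), normalized as in (3.30)."""
    log_post = []
    for m, p in zip(mu, prior):
        log_lik = -sum(
            0.5 * (di + 1.0) * math.log1p(((xi - m) / si) ** 2 / di)
            for xi, si, di in zip(x, sigma, d))
        log_post.append(math.log(p) + log_lik)
    top = max(log_post)                    # stabilize before exponentiating
    w = [math.exp(v - top) for v in log_post]
    total = sum(w)
    return [v / total for v in w]


# Hypothetical data: one outlier, and a prior that favors BIRD 5.
x = [2.9, 3.1, 3.2, 9.0]
sigma = [0.5] * 4
d = [1.0] * 4
mu = [1, 2, 3, 4, 5]
prior = [0.1, 0.1, 0.2, 0.2, 0.4]
pmf = t_posterior_pmf(x, sigma, d, mu, prior)
print(pmf)
```

With these numbers the three agreeing observations outweigh both the outlier
and the prior preference for BIRD 5, and the largest posterior probability
falls on the BIRD with parameter 3.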

4. Results of Simulation Experiments

This section reports some of the results of simulation experiments to
study the performance of various methods of combining WATCHER observations
to obtain an estimate of the parameter of the singing BIRD. All simulations
were carried out on an IBM 3033AP at the Naval Postgraduate School. Random
numbers were generated using IMSL routines. Some details and results of the
simulations are given below; for more see Meldrum [1986].

4.1 BIRDS with Univariate Parameters

There are 5 BIRDS with parameters $\{\mu_j\}$ equal to 1, 2, 3, 4 and 5. The
BIRD that sings has parameter $\mu_j$ with probability $p_j$.

The number of WATCHERS varies between 2 and 5. The observation of the
$i$th WATCHER is

$$x_i = \theta + E_i \tag{4.1}$$

where $\theta$ is the parameter of the BIRD that sings and $E_i$ is the
observational error. The distributions of observational error considered
are:

1) the normal distribution with mean 0 and standard deviation
$\sigma = 0.5$ (e.g. (2.3));

2) the $\epsilon$-contaminated normal (2.8) with mean 0, standard deviations
$\sigma_1 = 0.5$ and $\sigma_2 = 5$, and contamination probability
$\epsilon = 0.1$ and 0.25; CN($\epsilon$);

3) the Student t distribution (2.9) with $\sigma = 0.5$ and $d = 1$, which is
the Cauchy distribution.

Each simulation case has 10,000 replications. In each replication the
BIRD with parameter $\mu_j$ was drawn to sing with probability $p_j$ and
WATCHER observations were generated. The following estimates of the
parameter $\theta$ of the singing BIRD were computed:

1. the mean of the observations;

2. the median of the observations;

3. the iterative Student-t estimate (2.12)-(2.16) with assumed values
$\sigma = 0.5$ and $d = 1$ and 10;

4. the iterative $\epsilon$-contaminated normal estimate (2.20)-(2.22) with
assumed parameter values $\sigma_1 = 0.5$, $\sigma_2 = 5.0$ and
$\epsilon = 0.1$ and 0.25.

In each case, the BIRD whose parameter was closest to the estimated $\theta$
was estimated to be the BIRD that sang.

In each replication Bayes procedures for combining WATCHER observations
were also considered. The prior probability of the BIRD with parameter
$\mu_j$ singing was assumed to be 1/5 in each case (equally likely prior).
The assumed error distributions for the Bayes models were as follows:

1. normal with mean 0 and standard deviation $\sigma = 0.5$;

2. Student t with $\sigma = 0.5$ and degrees of freedom $d = 1$ and 10;

3. the $\epsilon$-contaminated normal with $\sigma_1 = 0.5$, $\sigma_2 = 5$
and $\epsilon = 0.1$ and 0.25.

For each assumed error distribution the posterior probability of the
singing BIRD having parameter $\mu_j$ was computed. The BIRD whose parameter
had the largest posterior probability was the estimate of the BIRD that
sang.
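A scaled-down version of one simulation case can be sketched as follows. It
is not the original program: it uses Python's random module rather than
IMSL, 2,000 replications rather than 10,000, an arbitrary seed, and only the
mean and median estimates; the function name simulate is invented.

```python
import random
from statistics import mean, median


def simulate(n_watchers, eps, reps=2000, seed=1):
    """Proportion of correct NN identifications for the mean and the median.

    Errors are contaminated normal: sd 5.0 with probability eps, else sd 0.5.
    """
    rng = random.Random(seed)
    mu = [1, 2, 3, 4, 5]
    hits = {"mean": 0, "median": 0}
    for _ in range(reps):
        theta = rng.choice(mu)                     # equally likely BIRDS
        x = []
        for _ in range(n_watchers):
            sd = 5.0 if rng.random() < eps else 0.5
            x.append(theta + rng.gauss(0.0, sd))   # observation (4.1)
        for name, est in (("mean", mean(x)), ("median", median(x))):
            guess = min(mu, key=lambda m: abs(est - m))   # nearest neighbor
            hits[name] += (guess == theta)
    return {name: count / reps for name, count in hits.items()}


result = simulate(n_watchers=5, eps=0.25)
print(result)
```

Running the same function with eps=0.0 gives the all-normal case, and
varying n_watchers from 2 to 5 mimics the range of WATCHER counts studied.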
In Table 1 proportions of correct identifications are given for a
simulation in which the BIRD with parameter $\mu_j$ sings with probability
1/5. The number of WATCHERS varies between 2 and 5. Observation errors were
generated using the "true error distribution". Estimates of $\theta$ were
computed using both correct and incorrect assumptions, given in the first
column, concerning the error distribution. The Bayes estimates assumed the
equally likely prior and the correct and incorrect assumptions, given in the
first column, about the error distribution.

Here are some conclusions that may be made from the simulations. The
BIRD estimates based on the mean and the normal Bayes model are the most
sensitive to incorrect error distribution assumptions; note that in the case
in which the true error distribution is $\epsilon$-contaminated normal with
$\epsilon = 0.25$, increasing the number of WATCHERS actually decreases the
proportion of correct identifications for these two procedures. This
behavior results from the fact that, with small numbers of WATCHERS,
increasing the number of WATCHERS increases the chances of having one or
more outlying observations. A more detailed explanation of this phenomenon
can be found in the Appendix. All of the procedures do about the same when
the true error distribution is normal. When the true error distribution is
not normal, the Bayes estimates based on the correct prior of equally likely
BIRDS and an error distribution other than normal or Student t with 10
degrees of freedom tend to have the highest proportion of correct
identifications.

In Table 2 the proportions of correct identifications are given for a
simulation experiment in which the BIRD with parameter $\mu = 1$ always
sings in each replication; this parameter is on an extreme of the parameter
set. In Table 3 the proportions are given for an experiment in which the
BIRD with parameter $\mu = 3$ always sings; this parameter is in the middle
of the parameter set. The other parameters in the simulations remain the
same as those for the simulation in Table 1. In particular the Bayes
estimation procedures use the (incorrect) prior of equally likely BIRDS.

Comparing Tables 1-3, the proportion of correct identifications is
smallest (respectively largest) in the case in which BIRD 3 (respectively
BIRD 1) always sings; that is, the position of the parameter of the singing
BIRD within the parameter space can make correct identification easier or
harder. Once again procedures based on the normal distribution (mean and
normal Bayes) do well if the true error distribution is normal but tend to
have smaller proportions of correct identifications if the true error
distribution is not normal; this decrease in the proportion of correct
identifications is greater than the decrease obtained by using procedures
based on non-normality of the error distribution when in fact the
observations have normal errors. In Tables 2 and 3 the effect of using
incorrect prior distributions in the Bayes models has been to make their
proportions of correct identifications closer to those of the parametric
procedures. However, the Bayes procedure based on a Student-t error
distribution with 1 degree of freedom appears to be quite robust to model
assumptions, particularly for the case of 2 WATCHERS.

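4.2 BIRDS with Bivariate Parameters

In this subsection there are 5 BIRDS. Each BIRD has two parameters
associated with it. Two configurations of the BIRDS' parameters were
considered:

LINE: $\mu_1 = (1,1)$, $\mu_2 = (2,2)$, $\mu_3 = (3,3)$, $\mu_4 = (4,4)$, $\mu_5 = (5,5)$;

BOX: $\mu_1 = (2,2)$, $\mu_2 = (2,4)$, $\mu_3 = (3,3)$, $\mu_4 = (4,2)$, $\mu_5 = (4,4)$.

The observation errors have the following distributions:

1. An $\epsilon$-contaminated bivariate normal with density function

$$f(x,y) = (1-\epsilon)\,\frac{1}{2\pi\,\sigma_{1,1}\sigma_{1,2}\sqrt{1-\rho_1^2}}\exp\{-Q_1(x,y)\} + \epsilon\,\frac{1}{2\pi\,\sigma_{2,1}\sigma_{2,2}\sqrt{1-\rho_2^2}}\exp\{-Q_2(x,y)\} \tag{4.2}$$

where

$$2(1-\rho_i^2)\,Q_i(x,y) = \left(\frac{x}{\sigma_{i1}}\right)^2 - 2\rho_i\left(\frac{x}{\sigma_{i1}}\right)\left(\frac{y}{\sigma_{i2}}\right) + \left(\frac{y}{\sigma_{i2}}\right)^2.$$

The parameters used are $\sigma_{1,1} = \sigma_{1,2} = 0.5$;
$\sigma_{2,1} = \sigma_{2,2} =$ [value illegible];
$\rho_1 = \rho_2 = -0.5$, 0, 0.5; and $\epsilon = 0$, 0.1, and 0.25.

Note that when $\epsilon = 0$, the error distribution is bivariate normal
with $\sigma_1 = 0.5$ and $\rho = -0.5$, 0, 0.5.

2. A bivariate Student-t with density function

$$f(x,y) = \frac{c}{\sigma_1\sigma_2\sqrt{1-\rho^2}}\left[1 + \frac{1}{d}\cdot\frac{1}{2(1-\rho^2)}\left[\left(\frac{x}{\sigma_1}\right)^2 - 2\rho\left(\frac{x}{\sigma_1}\right)\left(\frac{y}{\sigma_2}\right) + \left(\frac{y}{\sigma_2}\right)^2\right]\right]^{-(d+2)/2}$$

where $c$ is a constant and $d$ is the number of degrees of freedom. The
parameters used are $\sigma_1 = 0.5$, [symbol illegible] $= 5.0$, $d = 1$,
and $\rho = -0.5$, 0, 0.5.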
Each simulation case has 2,000 replications. For each replication a
BIRD with parameter $\mu_j$ is drawn to sing with probability $P_j$. An
independent observational error is drawn for each WATCHER from one of the
error distributions. The WATCHER observations are combined by using one of
the procedures below to determine the BIRD that sang.

1. Median: Compute the medians of the WATCHER observations of the
first parameter and of the second parameter. Compute the Euclidean distance
of the median pair to each BIRD parameter pair. The BIRD whose parameter
pair has the smallest distance is estimated to be the one that sang.

2. MLE normal: The observational errors are assumed to have a
bivariate normal distribution with parameters $\sigma_1 = 0.5$,
$\sigma_2 = 0.5$ and correlation coefficient $\rho$. The likelihood is
calculated for each BIRD parameter pair and the BIRD having the largest
likelihood is estimated to be the BIRD that sang.

3. MLE CN: The observational errors are assumed to have an
$\epsilon$-contaminated normal distribution function with parameters
$\sigma_{1,1} = \sigma_{1,2} = 0.5$ and $\sigma_{2,1} = \sigma_{2,2} =$
[value illegible]. The procedure is the same as 2.

4. MLE T: The same as 2 and 3 except the observational errors are
assumed to have a Student t distribution with parameters
$\sigma_1 = \sigma_2 = 0.5$, $\rho$ and $d$.
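Procedures 1 and 2 can be sketched compactly. The code below is an
illustration under stated assumptions: the function names, the three
sample observation pairs and the default $\rho = 0.5$ are chosen for the
example, and only the LINE configuration is shown.

```python
from statistics import median

LINE = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]


def median_nn(obs, birds):
    """Procedure 1: coordinate-wise medians, then the nearest BIRD pair."""
    mx = median(o[0] for o in obs)
    my = median(o[1] for o in obs)
    # Squared Euclidean distance gives the same ordering as the distance.
    return min(range(len(birds)),
               key=lambda j: (mx - birds[j][0]) ** 2 + (my - birds[j][1]) ** 2)


def normal_mle(obs, birds, s1=0.5, s2=0.5, rho=0.5):
    """Procedure 2: bivariate normal log likelihood evaluated at each BIRD."""
    def log_lik(bird):
        total = 0.0
        for x, y in obs:
            u, v = (x - bird[0]) / s1, (y - bird[1]) / s2
            total += (u * u - 2.0 * rho * u * v + v * v) / (2.0 * (1.0 - rho ** 2))
        return -total            # constants common to all BIRDS are dropped
    return max(range(len(birds)), key=lambda j: log_lik(birds[j]))


# Hypothetical observations from three WATCHERS.
obs = [(2.8, 3.1), (3.2, 2.9), (3.4, 3.3)]
j_median = median_nn(obs, LINE)
j_mle = normal_mle(obs, LINE)
print(LINE[j_median], LINE[j_mle])
```

Replacing the quadratic form in log_lik by the logarithm of the contaminated
normal or Student t density would give procedures 3 and 4 in the same
enumerative style.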

Table 4 shows the proportion of correct identifications for a case in
which the BIRDS' parameters are in the LINE configuration. Each BIRD is
equally likely to sing for each replication. The true error distributions
simulated were the bivariate normal, the $\epsilon$-contaminated normal with
$\epsilon = 0.1$, the $\epsilon$-contaminated normal with $\epsilon = 0.25$,
and the Student t with 1 degree of freedom (Cauchy); they are listed in the
first column of the table. The correlation coefficients of the simulated
errors were $\rho = 0.5$, 0, and -0.5 and are listed on the first row of the
table. The number of WATCHERS varies between 2 and 5. The estimation
procedures are listed in the second column of the Table and assumed
$\rho = 0.5$.

The simulations of Table 5 used the same models and estimation
procedures as those of Table 4 except that the maximum likelihood procedures
assumed $\rho = -0.5$. A comparison of Tables 4 and 5 indicates that the
value of the assumed $\rho$ for the maximum likelihood procedures made
little difference in the proportion of correct identifications.

In both Tables a comparison of the proportion of correct
identifications when the correlation coefficient of the true error
distribution is RHO = 0.5 with those when RHO = 0 or -0.5 indicates that it
is more difficult to identify the correct singing BIRD when the errors of
observation for the BIRD's two parameters are positively correlated.

A comparison of the proportion of correct identifications when the
normal maximum likelihood method is used with those of the other methods
indicates that the normal estimate is the most sensitive to incorrect
assumptions concerning the error distribution. As was true in the univariate
case, use of the normal estimate on data whose true error distribution has
longer tails than normal can result in decreasing proportions of correct
identifications as the number of WATCHERS increases.

Table 6 reports results of a simulation experiment with models and
estimation procedures the same as in Table 4 except that the BIRD whose
parameter is (3,3) always sings. Comparison of the results of Tables 4 and
6 indicates that the position of the singing BIRD in a pattern can affect
the chances of a correct identification.

Table 7 reports results of a simulation experiment in which the
parameters and estimates are the same as those of Table 4 except that the
BIRDS' parameters are in the BOX pattern instead of the LINE pattern. A
comparison of Tables 4 and 7 indicates that, if the correlation coefficient
of the true error distribution is $\rho = 0.5$ (respectively $\rho = -0.5$),
then it is easier (respectively harder) to make a correct identification of
the singing BIRD with the BIRDS' parameters in the BOX pattern.

4.3 Conclusions from the Simulation Study 

1. Estimation and identification procedures based on assumptions of 
normal errors are sensitive to outlying observations. 

2. Estimation procedures based on assumptions of a long-tailed error
distribution are more robust to incorrect error distribution assumptions
than normal estimation procedures.

3. Bayes estimation procedures are sensitive to incorrect 
specification of the prior distribution of which BIRD is singing. 

4. The following attributes affect the ability to correctly identify
the singing BIRD.

a. If each BIRD has more than 1 parameter, correlation between the
parameters' observation errors can influence the difficulty of
identification of the correct BIRD.

b. The configuration of the parameter space for the BIRDS can make
correct identification more difficult, e.g. LINE, BOX.

c. The location of the parameter of the singing BIRD in the
parameter space can make correct identification easier or harder, e.g.
middle BIRD or end BIRD in the univariate parameter case.

APPENDIX

HIT PROBABILITY WHEN BIRDS ARE ON A LINEAR LATTICE, ERRORS ARE
ε-CONTAMINATED, AND A LINEAR SUMMARY, NEAREST-NEIGHBOR ALGORITHM IS USED:

LINEAR CONSENSUS PROCEDURES NEED NOT SHOW "SAFETY IN NUMBERS".

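Suppose BIRD characteristics $\mu_j$ are at equal intervals:
$\mu_j = 0, \pm 1, \pm 2, \ldots$ with no limit to number. Let the $i$-th of $I$ WATCHERS
have the ε-contaminated error density

$$f_i(x;\theta) = \bar{\varepsilon}\,\frac{1}{\sqrt{2\pi}\,\sigma_{1i}}\exp\left[-\frac{1}{2}\left(\frac{x-\theta}{\sigma_{1i}}\right)^{2}\right] + \varepsilon\,\frac{1}{\sqrt{2\pi}\,\sigma_{2i}}\exp\left[-\frac{1}{2}\left(\frac{x-\theta}{\sigma_{2i}}\right)^{2}\right] \qquad \text{(A-1)}$$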



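where $\bar{\varepsilon} + \varepsilon = 1$, $\bar{\varepsilon} \ge 0$, $\varepsilon \ge 0$. It is known that a BLUE of $\theta$ is

$$\hat{\theta}_{BLUE} = \frac{\sum_{i=1}^{I} X_i/\sigma_i^{2}}{\sum_{i=1}^{I} 1/\sigma_i^{2}} \qquad \text{(A-2)}$$

where here

$$\sigma_i^{2} = \bar{\varepsilon}\,\sigma_{1i}^{2} + \varepsilon\,\sigma_{2i}^{2} \qquad \text{(A-3)}$$

and that

$$\mathrm{Var}[\hat{\theta}_{BLUE}] = \frac{1}{\sum_{i=1}^{I} 1/\sigma_i^{2}} \qquad \text{(A-4)}$$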
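
The weighting in (A-2) through (A-4) can be sketched in a few lines. This is only an illustration of inverse-variance weighting; the function names are hypothetical, and the variance of each WATCHER's error is taken to be the mixture variance of (A-3).

```python
def contaminated_variance(eps, sigma1, sigma2):
    """Variance of the eps-contaminated error, as in (A-3)."""
    return (1.0 - eps) * sigma1**2 + eps * sigma2**2


def blue_estimate(observations, variances):
    """Inverse-variance weighted average (A-2) and its variance (A-4)."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    estimate = sum(w * x for w, x in zip(weights, observations)) / total_weight
    return estimate, 1.0 / total_weight


if __name__ == "__main__":
    # Two hypothetical WATCHERS with different error scales.
    variances = [contaminated_variance(0.25, 0.5, 5.0),
                 contaminated_variance(0.25, 1.0, 5.0)]
    print(blue_estimate([0.3, -0.8], variances))
```

When all WATCHERS share the same variance the weights are equal and the estimate reduces to the ordinary average.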



Clearly there is a tendency for the above variance to decrease with I, so
one might conclude that adding more WATCHERS improves hit probability. This
conclusion is false. Perhaps more surprisingly, existence of theoretical
population moments does not seem to govern the behavior of the linear
estimate. Of course, if $\sigma_i^{2}$ doesn't exist then the above weighting
cannot be carried out, but if the error scale is the same for all WATCHERS
then equal weighting is suggested. It can be seen analytically that the
Student t with one d.f. (the Cauchy) error model implies that the equally
weighted $\hat{\theta}$ has exactly the same distribution regardless of the
number of WATCHERS, and this effect is plainly visible from simulations and
numerical calculations. On the other hand, the ε-contaminated Normal/Gauss
error model has all moments finite, and yet can exhibit a hit probability
that decreases with I, later of course increasing, as it eventually must, by
central limit theorem effects. For what is possibly a plausible example, the
advantage of number becomes evident only after about a dozen WATCHERS are
performing simultaneously!
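
The Cauchy claim can be checked with a small simulation. The sketch below assumes standard (unit-scale) Cauchy errors and the unit lattice spacing used in this Appendix, so that a hit means the average error falls within 1/2 of zero; the function name, trial count, and seed are arbitrary illustrative choices.

```python
import math
import random


def cauchy_average_hit_rate(n_watchers, trials=20000, seed=1):
    """Estimate P(|average of n_watchers Cauchy errors| < 1/2) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Standard Cauchy draws via the inverse CDF: tan(pi * (U - 1/2)).
        total = sum(math.tan(math.pi * (rng.random() - 0.5))
                    for _ in range(n_watchers))
        hits += abs(total / n_watchers) < 0.5
    return hits / trials


# The average of standard Cauchy errors is again standard Cauchy, so the
# hit probability is (2/pi) * atan(1/2) for every number of WATCHERS.
EXACT_HIT_PROBABILITY = 2.0 / math.pi * math.atan(0.5)

if __name__ == "__main__":
    print("exact", round(EXACT_HIT_PROBABILITY, 3))
    for n in (1, 5, 20):
        print(n, cauchy_average_hit_rate(n))
```

The simulated rates should agree with the exact value up to Monte Carlo error, whatever the number of WATCHERS.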
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Simplest Case; Statistically Identical WATCHERS 
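Let $\sigma_{1i}^{2} = \sigma_1^{2}$, $\sigma_{2i}^{2} = \sigma_2^{2}$, with $\sigma_1^{2} < \sigma_2^{2}$. To calculate hit probability, assume
with no loss of generality that BIRD 0 is singing. Then the probability of
a hit is the probability that the ordinary average of I errors lies between
$-\tfrac{1}{2}$ and $+\tfrac{1}{2}$:

$$P\{HIT\} = P\left[-\frac{1}{2} < \frac{X_1 + X_2 + \cdots + X_I}{I} < \frac{1}{2}\right] \qquad \text{(A-5)}$$

Condition on the error components involved: if G represents the number of
"good" (small variance, $\sigma_1^{2}$) observations, and B = I - G the number of "bad"
(large variance, $\sigma_2^{2}$), then G ~ Binomial($\bar{\varepsilon}$, I) and, given G,

$$\frac{X_1 + X_2 + \cdots + X_I}{I} \sim N\left(0,\ \frac{G\sigma_1^{2} + (I-G)\sigma_2^{2}}{I^{2}}\right) \qquad \text{(A-6)}$$

so

$$P\{HIT \mid G = g\} = 2\Phi\left(\frac{1/2}{\sqrt{g(\sigma_1^{2} - \sigma_2^{2}) + I\sigma_2^{2}}\,/\,I}\right) - 1 \qquad \text{(A-7)}$$

Consequently, when the condition is removed,

$$P\{HIT\} = \sum_{g=0}^{I} \binom{I}{g}\,\bar{\varepsilon}^{\,g}\,\varepsilon^{\,I-g}\,P\{HIT \mid G = g\} \qquad \text{(A-8)}$$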
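
Formula (A-8) is easy to evaluate numerically. The sketch below is one possible implementation, with a hypothetical function name; it uses the identity 2Φ(z) - 1 = erf(z/√2) and is run here with ε = 0.25, σ1 = 0.5, σ2 = 5 as example inputs.

```python
from math import comb, erf, sqrt


def hit_probability(n_watchers, eps, sigma1, sigma2):
    """Evaluate (A-8) for the ordinary average of n_watchers errors."""
    good = 1.0 - eps  # probability that an observation has the small variance
    total = 0.0
    for g in range(n_watchers + 1):
        # Binomial weight for g good observations out of n_watchers.
        weight = comb(n_watchers, g) * good**g * eps**(n_watchers - g)
        # Conditional variance of the average, as in (A-6).
        var_avg = (g * sigma1**2 + (n_watchers - g) * sigma2**2) / n_watchers**2
        z = 0.5 / sqrt(var_avg)
        # Conditional hit probability (A-7): 2*Phi(z) - 1 == erf(z / sqrt(2)).
        total += weight * erf(z / sqrt(2.0))
    return total


if __name__ == "__main__":
    for n in (2, 4, 6, 8, 10, 12, 14):
        print(n, round(hit_probability(n, 0.25, 0.5, 5.0), 3))
```

With these inputs the computed probability first falls and then rises again as the number of WATCHERS grows.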
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Numerical Illustration 



Suppose BIRDS occur as above and that WATCHERS are independent with
ε-contaminated errors having parameters $\varepsilon = .25$, $\sigma_1 = 0.5$,
$\sigma_2 = 5$. Then the BLUE is the ordinary average, $\bar{X} = \hat{\theta}$, which is also a
Bayes estimate if one were to assume that the error distribution is simple
Normal and the prior probabilities equal. Adopt the NN approach (what
else?) to identify the singing BIRD. Then we tabulate the

HIT PROBABILITIES

ALGORITHM                       NUMBER OF WATCHERS, I
                                2     4     6     8     10    12    14

$\hat{\theta}$ = AVERAGE, NN    0.54  0.48  0.48  0.49  0.51  0.55  0.57

$\hat{\theta}$ = MEDIAN, NN     0.54  0.77  0.86  0.92  0.94  0.96  0.98

The effect mentioned is quite striking, with linear $\hat{\theta}$ hit probability
dropping off quickly, recovering slowly, and not approaching that of the
median until a value of I much larger than any in our table is reached.
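
A comparison of this kind can be reproduced approximately by simulation. The sketch below is not the program used for the table above; it assumes ε = 0.25, σ1 = 0.5, σ2 = 5, a unit-spaced lattice with BIRD 0 singing, and a median that averages the two middle values when I is even. The function name, trial count, and seed are arbitrary.

```python
import random
import statistics


def simulate_hits(n_watchers, eps=0.25, sigma1=0.5, sigma2=5.0,
                  trials=20000, seed=0):
    """Return (average hit rate, median hit rate) under nearest-neighbor rules."""
    rng = random.Random(seed)
    avg_hits = 0
    med_hits = 0
    for _ in range(trials):
        # Each error is "bad" (scale sigma2) with probability eps, else "good".
        errors = [rng.gauss(0.0, sigma2 if rng.random() < eps else sigma1)
                  for _ in range(n_watchers)]
        # BIRD 0 is the nearest lattice point when the estimate is within 1/2.
        avg_hits += abs(statistics.fmean(errors)) < 0.5
        med_hits += abs(statistics.median(errors)) < 0.5
    return avg_hits / trials, med_hits / trials


if __name__ == "__main__":
    for n in (2, 4, 6, 8):
        print(n, simulate_hits(n))
```

Results vary slightly with the seed and trial count, so small differences from the tabulated values are to be expected.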
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Table 1 



Proportion of Correct Identifications 
BIRDS Equally Likely 

True Error Distribution 
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Table 2 



Proportion of Correct Identifications 
BIRD with PARAMETER 1 SINGS 

True Error Distribution 



Number of WATCHERS:

Normal, σ = 0.5

Student-t, 1 df
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Normal 
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Bayes Estimates 
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Table 3 
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True Error Distribution 
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Table 4 



LINE Pattern
Equally Likely BIRDS
Proportion of Correct Identifications

Assumed RHO = 0.5

                         CORRELATION COEFFICIENT USED TO GENERATE THE ERROR

                         RHO = 0.5             RHO = 0               RHO = -0.5
True Error               Number of WATCHERS:
Dist.      Estimate      2    3    4    5      2    3    4    5      2    3    4    5

CN(ε=0)    …             …    …    …    …      …    .99  .99  1      1    1    1    1

CN(ε=0.1)  MEDIAN        .82  .91  .95  .96    .85  .93  .96  .98    .87  .97  .97  .99
           NORMMLE       .82  .81  .83  .84    .85  .84  .83  .83    .87  .85  .84  .82
           TMLE 1df      .87  .94  .96  .97    .92  .97  .98  1      .97  .99  1    1
           TMLE 3df      .88  .94  .97  .98    .93  .97  .98  .99    .97  .99  1    1
           TMLE 10df     .88  .94  .97  .97    .92  .96  .98  .99    .95  .99  .99  1
           CNMLE(ε=0.1)  .89  .94  .97  .98    .93  .97  .99  1      .97  .99  1    1
           CNMLE(ε=0.25) .89  .94  .97  .98    .93  .97  .99  1      .97  .99  1    1

CN(ε=0.25) MEDIAN        .68  .85  .86  .92    .70  .87  .88  .93    .73  .88  .89  .94
           NORMMLE       .68  .67  .65  .68    .70  .67  .67  .65    .73  .68  .67  .65
           TMLE 1df      .82  .91  .93  .97    .86  .94  .97  .98    .93  .96  .98  .99
           TMLE 3df      .82  .91  .93  .96    .86  .94  .96  .98    .92  .95  .98  .99
           TMLE 10df     .80  .89  .92  .96    .85  .91  .94  .96    .89  .92  .96  .98
           CNMLE(ε=0.1)  .83  .92  .94  .97    .87  .94  .97  .98    .93  .97  .98  .99
           CNMLE(ε=0.25) .83  .92  .94  .97    .87  .95  .97  .98    .93  .97  .98  .99

T 1 d.f.   MEDIAN        .63  .79  .82  .87    .69  .83  .86  .91    .77  .88  .90  .93
           NORMMLE       .63  .63  .65  .64    .69  .68  .70  .69    .77  .77  .74  .76
           TMLE 1df      .78  .84  .89  .91    .80  .89  .94  .95    .88  .94  .97  .98
           TMLE 3df      .76  .84  .88  .91    .80  .89  .94  .94    .88  .94  .96  .98
           TMLE 10df     .75  .80  .85  .89    .79  .87  .92  .94    .87  .93  .95  .97
           CNMLE(ε=0.1)  .76  .81  .86  .89    .80  .87  .92  .94    .87  .92  .95  .97
           CNMLE(ε=0.25) .72  .82  .87  .89    .79  .87  .92  .94    .87  .92  .96  .97



Table 5

LINE Pattern
Equally Likely BIRDS
Proportion of Correct Identifications

Assumed RHO = -0.5

                         CORRELATION COEFFICIENT USED TO GENERATE THE ERROR

                         RHO = 0.5             RHO = 0               RHO = -0.5
True Error               Number of WATCHERS:
Dist.      Estimate      2    3    4    5      2    3    4    5      2    3    4    5

CN(ε=0)    MEDIAN        .92  .95  .97  .98    .96  .97  .99  .99    1    1    1    1
           NORMMLE       .92  .97  .98  .99    .96  .99  .99  1      1    1    1    1
           TMLE 1df      .89  .95  .97  .99    .95  .98  1    1      .99  1    1    1
           TMLE 3df      .91  .96  .98  .99    .96  .98  1    1      .99  1    1    1
           TMLE 10df     .91  .96  .98  .99    .96  .99  .99  1      1    1    1    1
           CNMLE(ε=0.1)  .89  .94  .97  .99    .95  .98  1    1      1    1    1    1
           CNMLE(ε=0.25) .88  .94  .96  .98    .94  .98  .99  1      .99  1    1    1

CN(ε=0.1)  MEDIAN        .82  .90  .95  .96    .86  .94  .96  .98    .87  .97  .98  .99
           NORMMLE       .82  .83  .82  .83    .86  .85  .83  .84    .87  .87  .84  .83
           TMLE 1df      .88  .92  .96  .98    .93  .97  .98  .99    .98  .99  1    1
           TMLE 3df      .89  .93  .96  .98    .93  .97  .99  1      .98  .99  1    1
           TMLE 10df     .89  .93  .96  .98    .93  .97  .98  .99    .97  .99  .99  1
           CNMLE(ε=0.1)  .87  .92  .96  .97    .93  .97  .99  .99    .98  1    1    1
           CNMLE(ε=0.25) .87  .91  .95  .97    .93  .97  .98  .99    .98  1    1    1

CN(ε=0.25) MEDIAN        .69  .82  .85  .92    .71  .86  .87  .94    .72  .89  .88  .95
           NORMMLE       .69  .68  .64  .66    .71  .68  .66  .66    .72  .67  .66  .66
           TMLE 1df      .82  .88  .92  .95    .87  .93  .97  .98    .93  .97  .99  .99
           TMLE 3df      .83  .88  .92  .95    .87  .93  .97  .98    .92  .97  .99  .99
           TMLE 10df     .83  .87  .92  .96    .87  .92  .96  .97    .90  .95  .98  1
           CNMLE(ε=0.1)  .82  .88  .92  .96    .87  .94  .97  .98    .93  .98  .99  1
           CNMLE(ε=0.25) .81  .88  .91  .95    .87  .94  .97  .98    .93  .98  .99  1

T 1 d.f.   MEDIAN        .63  .79  .82  .87    .68  .83  .85  .92    .76  .89  .91  .94
           NORMMLE       .63  .64  .64  .63    .68  .69  .70  .70    .76  .76  .78  .77
           TMLE 1df      .76  .81  .87  .91    .81  .89  .93  .96    .89  .94  .97  .98
           TMLE 3df      .76  .82  .88  .91    .81  .89  .93  .96    .89  .94  .97  .98
           TMLE 10df     .75  .81  .87  .90    .79  .87  .92  .95    .88  .93  .97  .98
           CNMLE(ε=0.1)  .74  .80  .85  .88    .79  .88  .91  .95    .88  .93  .97  .98
           CNMLE(ε=0.25) .74  .80  .85  .88    .79  .88  .91  .94    .88  .93  .97  .97



Table 6

LINE Pattern
BIRD (3,3) Always Sings
Proportion of Correct Identifications

Assumed RHO = 0.5

                         CORRELATION COEFFICIENT USED TO GENERATE THE ERROR

                         RHO = 0.5             RHO = 0               RHO = -0.5
True Error               Number of WATCHERS:
Dist.      Estimate      2    3    4    5      2    3    4    5      2    3    4    5

CN(ε=0)    MEDIAN        .90  .93  .97  .98    .95  .97  .99  .99    1    .99  1    1
           NORMMLE       .90  .96  .98  .99    .95  .99  .99  1      1    1    1    1
           TMLE 1df      .87  .94  .97  .99    .94  .98  .99  .99    .99  1    1    1
           TMLE 3df      .89  .95  .98  .99    .95  .99  .99  1      .99  1    1    1
           TMLE 10df     .89  .96  .98  .99    .95  .99  .99  1      .99  1    1    1
           CNMLE(ε=0.1)  .90  .96  .98  .99    .95  .99  .99  1      .99  1    1    1
           CNMLE(ε=0.25) .89  .96  .98  .99    .95  .98  .99  1      .99  1    1    1

CN(ε=0.1)  MEDIAN        .75  .88  .94  .95    .81  .93  .96  .97    .85  .96  .96  .99
           NORMMLE       .75  .78  .78  .77    .81  .80  .79  .80    .85  .81  .79  .80
           TMLE 1df      .84  .92  .96  .97    .90  .96  .98  .99    .97  .99  1    1
           TMLE 3df      .85  .92  .96  .98    .90  .97  .98  .99    .97  .98  1    1
           TMLE 10df     .84  .91  .96  .98    .89  .96  .98  .99    .95  .97  .99  1
           CNMLE(ε=0.1)  .85  .92  .96  .98    .91  .97  .98  .99    .97  .99  1    1
           CNMLE(ε=0.25) .85  .92  .96  .98    .91  .97  .98  .99    .97  .99  1    1

CN(ε=0.25) MEDIAN        .59  .81  .83  .90    .62  .82  .85  .93    .66  .85  .87  .93
           NORMMLE       .59  .57  .58  .57    .62  .59  .57  .58    .66  .59  .59  .57
           TMLE 1df      .78  .86  .92  .95    .84  .91  .96  .98    .89  .95  .98  .99
           TMLE 3df      .78  .87  .93  .95    .83  .91  .95  .98    .89  .94  .97  .99
           TMLE 10df     .76  .85  .91  .94    .80  .89  .94  .96    .85  .92  .96  .97
           CNMLE(ε=0.1)  .79  .88  .93  .96    .84  .93  .96  .98    .90  .95  .98  .99
           CNMLE(ε=0.25) .79  .88  .93  .96    .84  .93  .97  .98    .90  .96  .98  .99

T 1 d.f.   MEDIAN        .53  .74  .76  .85    .61  .80  .81  .89    .69  .84  .87  .93
           NORMMLE       .53  .54  .54  .56    .61  .63  .61  .62    .69  .69  .71  .73
           TMLE 1df      .71  .81  .85  .90    .76  .86  .90  .94    .84  .91  .95  .98
           TMLE 3df      .70  .80  .85  .90    .75  .86  .90  .94    .84  .91  .95  .98
           TMLE 10df     .68  .78  .83  .89    .73  .85  .89  .93    .83  .89  .95  .98
           CNMLE(ε=0.1)  .70  .79  .83  .87    .75  .84  .88  .92    .83  .89  .93  .97
           CNMLE(ε=0.25) .70  .79  .83  .88    .75  .84  .89  .93    .83  .89  .93  .97



Table 7

BOX Pattern
Equally Likely BIRDS
Proportion of Correct Identifications

Assumed RHO = 0.5

                         CORRELATION COEFFICIENT USED TO GENERATE THE ERROR

                         RHO = 0.5             RHO = 0               RHO = -0.5
True Error               Number of WATCHERS:
Dist.      Estimate      2    3    4    5      2    3    4    5      2    3    4    5

CN(ε=0)    MEDIAN        .96  .97  .98  .99    .97  .97  .99  .99    .96  .96  .99  .95
           NORMMLE       .96  .98  .99  1      .96  .99  1    1      .94  .98  .99  1
           TMLE 1df      .96  .98  .99  .99    .96  .98  .99  1      .93  .97  .98  1
           TMLE 3df      .96  .98  .99  .99    .96  .98  .99  1      .94  .97  .99  1
           TMLE 10df     .96  .98  .99  1      .96  .99  1    1      .95  .98  .99  1
           CNMLE(ε=0.1)  .96  .98  .99  1      .96  .98  1    1      .93  .96  .98  .99
           CNMLE(ε=0.25) .96  .98  .99  1      .95  .98  .99  1      .92  .96  .98  .99

CN(ε=0.1)  MEDIAN        .83  .93  .96  .98    .84  .94  .96  .98    .84  .93  .96  .97
           NORMMLE       .83  .81  .82  .79    .84  .83  .81  .82    .83  .81  .82  .79
           TMLE 1df      .92  .96  .98  .99    .92  .97  .99  1      .90  .95  .97  .98
           TMLE 3df      .92  .97  .98  .99    .93  .97  .99  .99    .90  .95  .98  .98
           TMLE 10df     .90  .96  .98  .99    .91  .97  .98  .99    .90  .94  .98  .98
           CNMLE(ε=0.1)  .93  .97  .99  .99    .93  .97  .99  .99    .90  .94  .97  .98
           CNMLE(ε=0.25) .93  .97  .98  .99    .93  .97  .99  .99    .90  .94  .97  .98

CN(ε=0.25) MEDIAN        .70  .87  .87  .92    .70  .85  .88  .93    .71  .87  .87  .93
           NORMMLE       .69  .63  .63  .61    .69  .64  .62  .62    .70  .64  .60  .61
           TMLE 1df      .87  .93  .96  .98    .87  .93  .96  .97    .84  .91  .94  .97
           TMLE 3df      .87  .93  .96  .97    .87  .93  .95  .97    .84  .91  .94  .97
           TMLE 10df     .85  .91  .95  .96    .85  .90  .94  .96    .83  .90  .93  .95
           CNMLE(ε=0.1)  .88  .94  .96  .98    .87  .93  .96  .98    .84  .91  .94  .97
           CNMLE(ε=0.25) .88  .94  .97  .98    .87  .93  .96  .98    .84  .90  .94  .97

T 1 d.f.   MEDIAN        .66  .83  .86  .91    .67  .83  .86  .91    .66  .83  .85  .91
           NORMMLE       .67  .71  .69  .69    .66  .67  .66  .66    .63  .63  .63  .63
           TMLE 1df      .81  .88  .92  .95    .80  .87  .92  .95    .77  .86  .90  .93
           TMLE 3df      .81  .88  .93  .95    .80  .87  .92  .95    .77  .85  .90  .92
           TMLE 10df     .79  .87  .92  .94    .77  .86  .90  .94    .75  .84  .89  .91
           CNMLE(ε=0.1)  .79  .87  .91  .94    .78  .85  .90  .93    .76  .85  .88  .91
           CNMLE(ε=0.25) .80  .87  .91  .94    .75  .85  .90  .93    .75  .84  .88  .91
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